AITopics | fixed point

Object Representations as Fixed Points: Training Iterative Refinement Algorithms with Implicit Differentiation

Neural Information Processing Systems | Dec-25-2025, 09:12:20 GMT

Current work in object-centric learning has been motivated by developing learning algorithms that infer independent and symmetric entities from the perceptual input. This often requires the use of iterative refinement procedures that break symmetries among equally plausible explanations for the data, but most prior works differentiate through the unrolled refinement process, which can make optimization exceptionally challenging. In this work, we observe that such iterative refinement methods can be made differentiable by means of the implicit function theorem, and develop an implicit differentiation approach that improves the stability and tractability of training such models by decoupling the forward and backward passes. This connection enables us to apply recent advances in optimizing implicit layers to not only improve the stability and optimization of the slot attention module in SLATE, a state-of-the-art method for learning entity representations, but do so with constant space and time complexity in backpropagation and only one additional line of code.
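The idea of differentiating at a fixed point rather than through the unrolled iterations can be illustrated on a scalar toy problem. The sketch below is not the slot attention module or the SLATE code; the refinement step `f`, the helper names, and the parameter values are all hypothetical. For a fixed point z* = f(z*, theta), the implicit function theorem gives dz*/dtheta = (df/dtheta) / (1 - df/dz), evaluated at z*, so the backward pass needs only the converged value and not the history of iterates.

```python
import math

def f(z, theta, x):
    """Hypothetical refinement step; a contraction in z whenever |theta| < 1."""
    return math.tanh(theta * z + x)

def solve_fixed_point(theta, x, z0=0.0, tol=1e-12, max_iter=1000):
    """Forward pass: iterate z <- f(z, theta, x) until the update is below tol."""
    z = z0
    for _ in range(max_iter):
        z_next = f(z, theta, x)
        if abs(z_next - z) < tol:
            return z_next
        z = z_next
    return z

def implicit_grad_theta(theta, x):
    """Backward pass: dz*/dtheta from the implicit function theorem."""
    z = solve_fixed_point(theta, x)
    # At the fixed point z = tanh(u), so tanh'(u) = 1 - z**2.
    slope = 1.0 - z * z
    df_dz = slope * theta      # partial derivative of f with respect to z
    df_dtheta = slope * z      # partial derivative of f with respect to theta
    return df_dtheta / (1.0 - df_dz)
```

The gradient costs the same whether the forward solve took ten iterations or a thousand, which is the decoupling of forward and backward passes that the abstract refers to. A central finite difference on `solve_fixed_point` is a simple way to sanity-check the result.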

fixed point, object representation, training iterative refinement algorithm, (3 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.78)

The Fixed Points of Off-Policy TD

Neural Information Processing Systems | Mar-15-2024, 14:57:28 GMT

J. Zico Kolter | The Fixed Points of Off-Policy TD | Poster T6. TD can fail to converge [Boyan, 1994] [Tsitsiklis and Van Roy, 1997]: fixed! This work is about fixing off-policy TD. Basic idea: reweight samples so that the TD solution has quality guarantees (and so that TD converges). Technical idea: "filtered" states, stationary distribution of policy.

fixed point, off-policy td, tsitsiklis and van roy, (3 more...)

Neural Information Processing Systems

Country: North America > United States > Massachusetts (0.08)

Technology: Information Technology > Artificial Intelligence (0.87)

Convergence of Some Convex Message Passing Algorithms to a Fixed Point

Voracek, Vaclav, Werner, Tomas

arXiv.org Machine Learning | Mar-7-2024

A popular approach to the MAP inference problem in graphical models is to minimize an upper bound obtained from a dual linear programming or Lagrangian relaxation by (block-)coordinate descent. Examples of such algorithms are max-sum diffusion and sequential tree-reweighted message passing. Convergence properties of these methods are currently not fully understood. They have been proved to converge to the set characterized by local consistency of active constraints, with unknown convergence rate; however, it was not clear if the iterates converge at all (to any single point). We prove a stronger result (which was conjectured before but never proved): the iterates converge to a fixed point of the algorithm. Moreover, we show that they achieve precision $\varepsilon>0$ in $\mathcal{O}(1/\varepsilon)$ iterations. We first prove this for a version of coordinate descent applied to a general piecewise-affine convex objective, using a novel proof technique. Then we demonstrate the generality of this approach by reducing some popular coordinate-descent algorithms to this problem. Finally we show that, in contrast to our main result, a similar version of coordinate descent applied to a constrained optimization problem need not converge.

affine function, algorithm, iteration, (13 more...)

arXiv.org Machine Learning

2403.07004

Country:

Asia > Middle East > Jordan (0.05)
North America > United States > Massachusetts > Middlesex County > Belmont (0.04)
Europe > Ukraine > Kyiv Oblast > Kyiv (0.04)
(4 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

A Lagrangian Approach to Fixed Points

Neural Information Processing Systems | Apr-6-2023, 19:38:21 GMT

We present a new way to derive dissipative, optimizing dynamics from the Lagrangian formulation of mechanics. It can be used to obtain both standard and novel neural net dynamics for optimization problems. To demonstrate this we derive standard descent dynamics as well as nonstandard variants that introduce a computational attention mechanism.

fixed point, lagrangian approach

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.62)

The Fixed Points of Off-Policy TD

Neural Information Processing Systems | Apr-6-2023, 13:13:17 GMT

Off-policy learning, the ability for an agent to learn about a policy other than the one it is following, is a key element of Reinforcement Learning, and in recent years there has been much work on developing Temporal Difference (TD) algorithms that are guaranteed to converge under off-policy sampling. It has remained an open question, however, whether anything can be said a priori about the quality of the TD solution when off-policy sampling is employed with function approximation. In general the answer is no: for arbitrary off-policy sampling the error of the TD solution can be unboundedly large, even when the approximator can represent the true value function well. In this paper we propose a novel approach to address this problem: we show that by considering a certain convex subset of off-policy distributions we can indeed provide guarantees as to the solution quality similar to the on-policy case. Furthermore, we show that we can efficiently project onto this convex set using only samples generated from the system. The end result is a novel TD algorithm that has approximation guarantees even in the case of off-policy sampling and which empirically outperforms existing TD methods.
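To make the sensitivity of the TD solution to the sampling distribution concrete, here is a minimal sketch on a hypothetical two-state chain with a single scalar feature. It is not the algorithm proposed in the paper, and the chain, features, rewards, and distributions are all assumptions chosen for illustration. With linear features, the TD fixed point w solves A w = b, where A and b are averages under the state-sampling distribution d.

```python
def td_fixed_point(d, P, phi, r, gamma):
    """Return (w, A) for scalar features, where A * w = b and
    A = sum_s d[s] * phi[s] * (phi[s] - gamma * E[phi(s') | s]),
    b = sum_s d[s] * phi[s] * r[s]."""
    n = len(d)
    A = 0.0
    b = 0.0
    for s in range(n):
        expected_next_phi = sum(P[s][t] * phi[t] for t in range(n))
        A += d[s] * phi[s] * (phi[s] - gamma * expected_next_phi)
        b += d[s] * phi[s] * r[s]
    return b / A, A

# Hypothetical chain: both states move to state 1, which then self-loops.
P = [[0.0, 1.0], [0.0, 1.0]]
phi = [1.0, 2.0]   # one scalar feature per state
r = [0.0, 1.0]     # reward received in each state
gamma = 0.99

# On-policy: the stationary distribution puts all mass on state 1.
w_on, A_on = td_fixed_point([0.0, 1.0], P, phi, r, gamma)
# Off-policy: a small amount of mass on state 0 drives A toward zero.
w_off, A_off = td_fixed_point([0.039, 0.961], P, phi, r, gamma)
```

In this toy setup the on-policy weight is about 50, while the off-policy weight is on the order of several thousand because A is nearly zero. Shifting only about 4% of the sampling mass is enough to make the solution arbitrarily poor, which is the kind of failure the restricted set of distributions in the abstract is meant to rule out.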

fixed point, off-policy td, td solution, (1 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Fixed points of monotonic and (weakly) scalable neural networks

Piotrowski, Tomasz, Cavalcante, Renato L. G.

arXiv.org Machine Learning | Jul-1-2021

We derive conditions for the existence of fixed points of neural networks, an important research objective to understand their behavior in modern applications involving autoencoders and loop unrolling techniques, among others. In particular, we focus on networks with nonnegative inputs and nonnegative network parameters, as often considered in the literature. We show that such networks can be recognized as monotonic and (weakly) scalable functions within the framework of nonlinear Perron-Frobenius theory. This fact enables us to derive conditions for the existence of a nonempty fixed point set of the neural networks, and these conditions are weaker than those obtained recently using arguments in convex analysis, which are typically based on the assumption of nonexpansivity of the activation functions. Furthermore, we prove that the shape of the fixed point set of monotonic and weakly scalable neural networks is often an interval, which degenerates to a point for the case of scalable networks. The chief results of this paper are verified in numerical simulations, where we consider an autoencoder-type network that first compresses angular power spectra in massive MIMO systems and then reconstructs the input spectra from the compressed signal.

fixed point, scalable neural network

arXiv.org Machine Learning

2106.16239

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

The Fixed Points of Off-Policy TD

Kolter, J. Z.

Neural Information Processing Systems | Feb-14-2020, 23:42:07 GMT

Off-policy learning, the ability for an agent to learn about a policy other than the one it is following, is a key element of Reinforcement Learning, and in recent years there has been much work on developing Temporal Difference (TD) algorithms that are guaranteed to converge under off-policy sampling. It has remained an open question, however, whether anything can be said a priori about the quality of the TD solution when off-policy sampling is employed with function approximation. In general the answer is no: for arbitrary off-policy sampling the error of the TD solution can be unboundedly large, even when the approximator can represent the true value function well. In this paper we propose a novel approach to address this problem: we show that by considering a certain convex subset of off-policy distributions we can indeed provide guarantees as to the solution quality similar to the on-policy case. Furthermore, we show that we can efficiently project onto this convex set using only samples generated from the system.

fixed point, off-policy td, td solution, (1 more...)

Neural Information Processing Systems

Genre: Research Report (0.42)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.90)

Iterative Neural Networks with Bounded Weights

Piotrowski, Tomasz, Rykaczewski, Krzysztof

arXiv.org Machine Learning | Aug-19-2019

A recent analysis of a model of an iterative neural network in Hilbert spaces established fundamental properties of such networks, such as existence of fixed point sets, convergence analysis, and Lipschitz continuity. Building on these results, we show that under a single mild condition on the weights of the network, one is guaranteed to obtain a neural network converging to its unique fixed point. We provide a bound on the norm of this fixed point in terms of norms of weights and biases of the network. We also show why this model of a feed-forward neural network is not able to accommodate Hopfield networks under our assumption. Artificial neural networks are becoming indispensable tools in a variety of spheres of human activity and society in general.
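The abstract does not state the exact condition on the weights, so the following sketch substitutes a standard sufficient condition as an assumption: if the activation is 1-Lipschitz (tanh here) and the weight matrix has infinity norm below 1, the layer is a contraction and the Banach fixed-point theorem gives a unique fixed point that the iteration reaches from any starting point. The weights, biases, and function names are hypothetical and may not match the paper's setting.

```python
import math

def layer(x, W, b):
    """One application of the network: x -> tanh(W x + b)."""
    return [math.tanh(sum(w_ij * x_j for w_ij, x_j in zip(row, x)) + b_i)
            for row, b_i in zip(W, b)]

def iterate(x, W, b, tol=1e-12, max_iter=10_000):
    """Apply the layer repeatedly until the largest coordinate change is below tol."""
    for _ in range(max_iter):
        x_next = layer(x, W, b)
        if max(abs(u - v) for u, v in zip(x_next, x)) < tol:
            return x_next
        x = x_next
    return x

# Hypothetical weights: the maximum absolute row sum is 0.7, below 1.
W = [[0.3, -0.4], [0.2, 0.5]]
b = [0.1, -0.2]
inf_norm = max(sum(abs(w) for w in row) for row in W)

# Two very different starting points should reach the same fixed point.
x_a = iterate([0.0, 0.0], W, b)
x_b = iterate([5.0, -3.0], W, b)
```

Comparing `x_a` and `x_b` coordinate by coordinate is a quick numerical check of uniqueness under this assumed condition. It does not replace the norm bound on the fixed point that the abstract mentions.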

artificial intelligence, machine learning, neural network, (17 more...)

arXiv.org Machine Learning

1908.05982

Country: Europe > Poland (0.15)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Quantized Memory-Augmented Neural Networks

Park, Seongsik (Seoul National University) | Kim, Seijoon (Seoul National University) | Lee, Seil (Seoul National University) | Bae, Ho (Seoul National University) | Yoon, Sungroh (Seoul National University)

AAAI Conferences | Feb-8-2018

Memory-augmented neural networks (MANNs) refer to a class of neural network models equipped with external memory (such as neural Turing machines and memory networks). These neural networks outperform conventional recurrent neural networks (RNNs) in terms of learning long-term dependency, allowing them to solve intriguing AI tasks that would otherwise be hard to address. This paper concerns the problem of quantizing MANNs. Quantization is known to be effective when we deploy deep models on embedded systems with limited resources. Furthermore, quantization can substantially reduce the energy consumption of the inference procedure. These benefits justify recent developments of quantized multilayer perceptrons, convolutional networks, and RNNs. However, no prior work has reported the successful quantization of MANNs. The in-depth analysis presented here reveals various challenges that do not appear in the quantization of the other networks. Without addressing them properly, quantized MANNs would normally suffer from excessive quantization error, which leads to degraded performance. In this paper, we identify memory addressing (specifically, content-based addressing) as the main reason for the performance degradation and propose a robust quantization method for MANNs to address the challenge. In our experiments, we achieved a computation-energy gain of 22× with 8-bit fixed-point and binary quantization compared to the floating-point implementation. Measured on the bAbI dataset, the resulting model, named the quantized MANN (Q-MANN), improved the error rate by 46% and 30% with 8-bit fixed-point and binary quantization, respectively, compared to the MANN quantized using conventional techniques.
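Note that "fixed-point" in this entry refers to the number format, not to a fixed point of an iteration. As a rough illustration of what 8-bit fixed-point quantization means, the sketch below maps a float to a signed 8-bit integer with a chosen number of fractional bits and back. It is a generic scheme, not the Q-MANN method; the function names and the default split of 5 fractional bits are assumptions.

```python
def quantize(x, frac_bits=5, total_bits=8):
    """Round x to the nearest multiple of 2**-frac_bits and saturate to the
    signed integer range that fits in total_bits."""
    scale = 1 << frac_bits
    q_min = -(1 << (total_bits - 1))
    q_max = (1 << (total_bits - 1)) - 1
    q = int(round(x * scale))
    return max(q_min, min(q_max, q))

def dequantize(q, frac_bits=5):
    """Map the stored integer back to the real value it represents."""
    return q / (1 << frac_bits)
```

With 5 fractional bits the step size is 1/32, so in-range values carry a rounding error of at most 1/64, and values outside roughly [-4, 4) saturate. Saturation and rounding of this kind are the sources of quantization error that the abstract attributes to degraded performance in content-based addressing.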

artificial intelligence, machine learning, quantization, (20 more...)

AAAI Conferences

Thirty-Second AAAI Conference on Artificial Intelligence

Genre: Research Report > New Finding (0.67)

Technology: